Data reanalysis

Marcel Ferreira

4/18/23

Outline

  • Motivation

  • Databases

  • GEO2R

  • ArrayExpress

  • TCGA

About me

  • Medical Physicist (2011-2014);

  • Master of Science in Biotechnology (2017);

  • PhD in Biotechnology (2023);

About me

I started in 2013, during my undergraduate degree, working on protein structure.

Experimental SAXS curve of SCLam and fitting procedure with simulated curves.

About me

  • Since 2015 I have focused on bone tissue, especially cell-biomaterial interactions;

  • Master's: Kinome;

  • PhD: Transcriptome/Proteome/Epigenome;

Data

Data to wisdom

Data

Data to conspiracy

Bioinformatics

Data science vs Bioinformatics.

NIH: “Bioinformatics, as related to genetics and genomics, is a scientific subdiscipline that involves using computer technology to collect, store, analyze and disseminate biological data and information, such as DNA and amino acid sequences or annotations about those sequences.”

Biological data

Biological data - Databases

Consortiums

Biological data - Databases

  • These are online libraries of biological data, collected from scientific experiments, published literature, high-throughput experimental technologies, and computational analyses;

  • Structured annotations!

  • Provide access to data programmatically (API);

  • Data is made freely available under certain licenses;
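As an illustration of programmatic access, the sketch below builds a search request for the GEO DataSets database through the NCBI E-utilities `esearch` endpoint and reads the record IDs out of a JSON response. The helper names (`build_geo_search_url`, `extract_ids`) and the example search term are hypothetical; the code only constructs the URL and parses an already-downloaded response, so it sends nothing over the network.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_geo_search_url(term: str, max_results: int = 20) -> str:
    """Build an esearch URL that queries the GEO DataSets database (db=gds)."""
    params = {
        "db": "gds",            # Entrez database name for GEO DataSets
        "term": term,           # free-text query (hypothetical example below)
        "retmax": max_results,  # maximum number of IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def extract_ids(response: dict) -> list:
    """Pull the list of record IDs out of a parsed esearch JSON response."""
    return list(response.get("esearchresult", {}).get("idlist", []))


if __name__ == "__main__":
    print(build_geo_search_url("osteoblast AND titanium", max_results=5))
```

The resulting URL can be opened in a browser or fetched with any HTTP client; NCBI asks users to keep the request rate low, so heavier use would normally add an API key and a delay between calls.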

Biological data - Experiments

Gene Expression Omnibus

ArrayExpress

Microarrays, RNA-Seq, miRNA-Seq, Proteomics (few), ChIP-seq, ATAC-seq, Kinase Array, Single Cell, …

Biological data

Why MUST you publish your data from high-throughput experiments?

  • It allows reuse by the scientific community;

    • Reduction of waste: money, reagents, LIVES;

  • Peer review;

    • Open-data/Open science;

    • Good practices;

RNAseq

Advancing RNA-Seq analysis, Nature Biotechnology 28:421-423

GEO-GEO2R

  • Gene Expression Omnibus;

  • NCBI-NIH;

  • About 4350 datasets;

  • Multiple species;

  • Array- and sequence-based data are accepted;
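GEO records are identified by accession numbers whose prefix tells you the record type: GSE for a series, GSM for a sample, GPL for a platform and GDS for a curated dataset. The sketch below parses an accession and builds the address of its record page and, for a series, of its download directory. The function names are hypothetical, and the directory layout (the last three digits replaced by `nnn`) is an assumption based on how the GEO download site is organised, so check it against the site before relying on it.

```python
import re

# GEO accession prefixes and the record type each one denotes.
ACCESSION_KINDS = {
    "GSE": "series",
    "GSM": "sample",
    "GPL": "platform",
    "GDS": "dataset",
}


def parse_accession(accession: str):
    """Split an accession such as 'GSE12345' into ('GSE', 12345)."""
    match = re.fullmatch(r"(GSE|GSM|GPL|GDS)(\d+)", accession.strip().upper())
    if match is None:
        raise ValueError(f"not a GEO accession: {accession!r}")
    return match.group(1), int(match.group(2))


def accession_page_url(accession: str) -> str:
    """Address of the GEO record page for a series, sample or platform."""
    prefix, number = parse_accession(accession)
    if prefix == "GDS":
        raise ValueError("this sketch only covers GSE, GSM and GPL pages")
    return f"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc={prefix}{number}"


def series_ftp_dir(accession: str) -> str:
    """Download directory of a series (assumed 'GSE12nnn/GSE12345/' layout)."""
    prefix, number = parse_accession(accession)
    if prefix != "GSE":
        raise ValueError("only series (GSE) accessions are supported here")
    stub = str(number)[:-3] + "nnn"  # e.g. 12345 -> '12nnn', 123 -> 'nnn'
    return f"https://ftp.ncbi.nlm.nih.gov/geo/series/GSE{stub}/GSE{number}/"


if __name__ == "__main__":
    prefix, _ = parse_accession("GSE12345")
    print(ACCESSION_KINDS[prefix], accession_page_url("GSE12345"))
```

GEO2R itself is run from the series record page, so `accession_page_url` gives the natural starting point for the analysis shown in the following slides.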

Gene Expression Omnibus

GEO-GEO2R

Gene Expression Omnibus

Thank you!

Marcel Ferreira, PhD (@marceelrf)

https://quartodomarcel.netlify.app/

https://marcel-ferreira.shinyapps.io/SciDashboard_marceelrf/

References

  • https://learn.gencore.bio.nyu.edu/rna-seq-analysis/

  • https://star-protocols.cell.com/protocols/931

  • https://statquest.org/

  • https://www.youtube.com/watch?v=tlf6wYJrwKY

  • https://home.proffernandamaciel.com.br/
